A Probabilistic Classification System for Predicting the Cellular Localization Sites of Proteins

Authors

  • Paul Horton
  • Kenta Nakai
Abstract

We have defined a simple model of classification which combines human-provided expert knowledge with probabilistic reasoning. We have developed software to implement this model and have applied it to the problem of classifying proteins into their various cellular localization sites based on their amino acid sequences. Since our system requires no hand tuning to learn from training data, we can now evaluate the prediction accuracy of protein localization sites by a more objective cross-validation method than earlier studies using production-rule expert systems. 336 E. coli proteins were classified into 8 classes with an accuracy of 81%, while 1484 yeast proteins were classified into 10 classes with an accuracy of 55%. Additionally, we report empirical results using three different strategies for handling continuously valued variables in our probabilistic reasoning system.
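The abstract names cross-validation and the handling of continuously valued variables but gives no details of either. The following is a minimal sketch of how such an evaluation could look, not the authors' system: it assumes equal-width binning as one possible way to handle continuous features, a naive Bayes classifier with Laplace smoothing as a stand-in probabilistic model, and k-fold cross-validation. All function names and the toy data are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def discretize(values, n_bins):
    """Map continuous values to equal-width bin indices 0..n_bins-1.

    Assumption: equal-width binning is only one illustrative strategy.
    Simplification: bin edges come from all values passed in, so binning
    the full dataset before splitting lets test data influence the edges.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for constant features
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def train(rows, labels):
    """Count class frequencies and per-class bin frequencies for each feature."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, class) -> bin counts
    for row, label in zip(rows, labels):
        for j, v in enumerate(row):
            value_counts[(j, label)][v] += 1
    return class_counts, value_counts


def predict(model, row, n_bins):
    """Return the class with the highest smoothed log posterior."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / total)
        for j, v in enumerate(row):
            # Laplace smoothing keeps unseen bins from zeroing the probability
            score += math.log((value_counts[(j, c)][v] + 1) / (n_c + n_bins))
        if score > best_score:
            best, best_score = c, score
    return best


def cross_validate(rows, labels, k, n_bins, seed=0):
    """Estimate accuracy by k-fold cross-validation: every example is
    predicted exactly once by a model that never saw it during training."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for fold in range(k):
        test = set(idx[fold::k])
        train_idx = [i for i in idx if i not in test]
        model = train([rows[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        correct += sum(predict(model, rows[i], n_bins) == labels[i] for i in test)
    return correct / len(rows)


# Toy data, made up for illustration: one feature separates two classes.
raw = [0.1, 0.2, 0.15, 0.3, 0.25, 0.9, 0.8, 0.85, 0.95, 0.7]
labels = ["class_a"] * 5 + ["class_b"] * 5
rows = [[b] for b in discretize(raw, n_bins=2)]
accuracy = cross_validate(rows, labels, k=5, n_bins=2)
```

Because no fold is scored by a model that was trained on it, the resulting accuracy does not depend on hand tuning against the evaluated examples, which is the sense in which the abstract calls cross-validation more objective.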

Similar resources

Localization and Study of Histochemical Effects

A high capacity for accumulation of Mn was reported for sunflower plants. Localization of excess Mn is therefore of special interest for understanding metal tolerance mechanisms in this species. In this study, structural and histochemical alterations caused by Mn accumulation in leaves were investigated in sunflower (Helianthus annuus L. cv. Azar-ghol) plants grown in nutrient solution. In the ...

Predicting the Cellular Localization Sites of Proteins Using Decision Tree and Neural Networks

In this paper, we describe the implementation of a set of machine learning techniques: Decision Tree, Perceptrons, Two-layer feed-forward Neural Networks. We describe the application of these techniques to the problem of classifying proteins into their various cellular localization sites based on their amino acid sequences. We evaluate the performance of each technique by analyzing the predicti...

I-49: Human Y Chromosome Proteome Project

The success of the Human Genome Project (HGP) has provided a blueprint for the approximately 20,000 gene-encoded proteins potentially active in all of the hundreds of cell types that make up the human body. Yet we still have limited knowledge about a majority of the gene-encoded proteins which are the “building blocks of life” and “cellular machinery”. It is estimated that for nearly half of th...

Dysregulated Expression and Subcellular Localization of Base Excision Repair (BER) Pathway Enzymes in Gallbladder Cancer

The base excision repair (BER) pathway is one of the repair systems that have an impact on radiotherapy and chemotherapy for cancer patients. The molecular pathogenesis of gallbladder cancer is not well understood. In the present study we investigated whether the expression of AP endonuclease 1 (APE1) and DNA polymerase β (DNA pol β), key enzymes of the BER pathway, has any clinical ...

Journal title:
  • Proceedings. International Conference on Intelligent Systems for Molecular Biology

Volume 4  Issue 

Pages  -

Publication date 1996